{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# An example of Multi-Armed Bandits in Seldon: Epsilon Greedy Algorithm"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this notebook we will use a [Multi-Armed Bandits](https://en.wikipedia.org/wiki/Multi-armed_bandit) algorithm to deploy 3 models in parallel. The algorithm will observe rewards and learn to route requests to the best model as time goes by.\n",
    "\n",
    "Seldon's implementation of the Epsilon Greedy algorithm is open source and available in the Seldon Core examples [here](https://github.com/SeldonIO/seldon-core/blob/master/examples/routers/epsilon_greedy/EpsilonGreedy.py)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setting up the stage"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "What follows assumes you have a cluster running with kubernetes (RBAC anabled) and kubectl pointing at it. First we will start Helm and Seldon"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!kubectl -n kube-system create sa tiller\n",
    "!kubectl create clusterrolebinding tiller --clusterrole cluster-admin --serviceaccount=kube-system:tiller\n",
    "!helm init --service-account tiller"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!helm install ../helm-charts/seldon-core-crd --name seldon-core-crd --set usage_metrics.enabled=true"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!kubectl create namespace mab"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!helm install ../helm-charts/seldon-core --name seldon-core \\\n",
    "        --set apife.service_type=LoadBalancer \\\n",
    "        --namespace mab"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!kubectl get svc -l app=seldon-apiserver-container-app -n mab"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Setup python code to do REST and gRPC requests. **Only run this when the LoadBalancer created by GCP for the seldon-apife is running**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from visualizer import get_graph\n",
    "import requests\n",
    "from requests.auth import HTTPBasicAuth\n",
    "import grpc\n",
    "try:\n",
    "    from commands import getoutput # python 2\n",
    "except ImportError:\n",
    "    from subprocess import getoutput # python 3\n",
    "import matplotlib.pyplot as plt\n",
    "%matplotlib inline\n",
    "\n",
    "NAMESPACE=\"mab\"\n",
    "SELDON_API_IP=getoutput(\"kubectl get svc -n \"+NAMESPACE+\" -l app=seldon-apiserver-container-app -o jsonpath='{.items[0].status.loadBalancer.ingress[0].ip}'\")\n",
    "\n",
    "def get_token():\n",
    "    payload = {'grant_type': 'client_credentials'}\n",
    "    response = requests.post(\n",
    "                \"http://{}:8080/oauth/token\".format(SELDON_API_IP),\n",
    "                auth=HTTPBasicAuth('oauth-key', 'oauth-secret'),\n",
    "                data=payload)\n",
    "    token =  response.json()[\"access_token\"]\n",
    "    return token\n",
    "\n",
    "def rest_request(request):\n",
    "    token = get_token()\n",
    "    headers = {'Authorization': 'Bearer '+token}\n",
    "    response = requests.post(\n",
    "                \"http://{}:8080/api/v0.1/predictions\".format(SELDON_API_IP),\n",
    "                headers=headers,\n",
    "                json=request)\n",
    "    return response.json()\n",
    "    \n",
    "def send_feedback_rest(request,response,reward):\n",
    "    token = get_token()\n",
    "    headers = {\"Authorization\": \"Bearer \"+token}\n",
    "    feedback = {\n",
    "        \"request\": request,\n",
    "        \"response\": response,\n",
    "        \"reward\": reward\n",
    "    }\n",
    "    ret = requests.post(\n",
    "        \"http://{}:8080/api/v0.1/feedback\".format(SELDON_API_IP),\n",
    "        headers=headers,\n",
    "        json=feedback)\n",
    "    return ret.text"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Deploying the prediction graph"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The graph we will deploy is as follows:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "get_graph(\"resources/epsilon_greedy.json\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For the router we will use the image seldonio/mab_epsilon_greedy:1.0 that was built from the Epsilon Greedy model available in the examples. For the classifiers we will use the image seldonio/mock_classifier:1.0\n",
    "\n",
    "The complete json for the graph is as follows:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!jq .spec.predictors[0].graph resources/epsilon_greedy.json"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The epsilon greedy router takes the following parameters:\n",
    " - \"n_branches\": Required. Must match the number of children of the router (3 in this case).\n",
    " - \"epsilon\": Optional, defaults to 0.1. The exploration parameter of the algorithm.\n",
    " - \"verbose\": Optional, defaults to False. Verbose printout in the kubernetes logs.\n",
    "\n",
    "Let's create the Seldon Deployment"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "!kubectl apply -f resources/epsilon_greedy.json -n mab"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!kubectl get seldondeployments seldon-deployment-example -o jsonpath=\"{.status}\" -n mab"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Routing Requests"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "First let's build a prediction request that we will use throughout this tutorial:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "request = {\n",
    "    \"data\": {\n",
    "        \"ndarray\":[[0,0]],\n",
    "        \"names\":[\"feature_1\",\"feature_2\"]\n",
    "    }\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's send it to our deployed predictor"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "rest_request(request)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The response metadata contains a routing dictionary that indicates which branch in the graph was selected by the router.\n",
    "Before it has been sent any feedback, the Epsilon greedy algorithm will send 70% of the requests to branch 0, and 30% of the requests to the other branches (because we chose epsilon=0.3).\n",
    "\n",
    "To test this, let's send 100 requests and observe the distribution of routings:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "results = {0:0,1:0,2:0}\n",
    "for i in range(100):\n",
    "    response = rest_request(request)\n",
    "    route = response.get(\"meta\").get(\"routing\").get(\"eg-router\")\n",
    "    results[route]+=1\n",
    "for branch,n in results.items():\n",
    "    print(\"{} requests were sent to branch {}\".format(n,branch))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we will send feedback to the router about a particular prediction.\n",
    "Feedback takes the following form:\n",
    "```python\n",
    "{\n",
    "    request: # The original request\n",
    "    response: # The response sent by seldon\n",
    "    reward : # A float number representing a reward for the prediction\n",
    "    truth : # Optional\n",
    "}\n",
    "```\n",
    "To clarify, truth is for when you observe the actual value of the random variable you want to predict a posteriori. This is not used by the epsilon greedy router. All we need is the request, response and a binary reward.\n",
    "\n",
    "First, let's get a prediction and save the response:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "response = rest_request(request)\n",
    "response"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now let's send a negative feedback about this prediction"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "send_feedback_rest(request,response,reward=0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The epsilon greedy router keeps track of the best branch according to the observed feedbacks.\n",
    "\n",
    "To test the behaviour of the algorithm, we will run a little simulation. In what follows we will do successive predictions and feedbacks, and send a reward of 1 every time the request was routed to branch 2, and a reward of 0 otherwise. We should observe that the algorithm starts sending requests mainly to branch 2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "routes_history = []\n",
    "for _ in range(100):\n",
    "    response = rest_request(request)\n",
    "    route = response.get(\"meta\").get(\"routing\").get(\"eg-router\")\n",
    "    if route == 2:\n",
    "        send_feedback_rest(request,response,reward=1)\n",
    "    else:\n",
    "        send_feedback_rest(request,response,reward=0)\n",
    "    routes_history.append(route)\n",
    "\n",
    "plt.figure(figsize=(15,6))\n",
    "ax = plt.scatter(range(len(routes_history)),routes_history)\n",
    "ax.axes.xaxis.set_label_text(\"Incoming Requests over Time\")\n",
    "ax.axes.yaxis.set_label_text(\"Selected Branch\")\n",
    "plt.yticks([0,1,2])\n",
    "_ = plt.title(\"Branch Chosen for Incoming Requests\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can see that the algorithm very quickly figured out that branch 2 was the best one."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Working with predictions in batches"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So far we ignored the fact that you can get predictions in batches, but we only send a single number as a reward. What if the predictions returned deserve different rewards?\n",
    "\n",
    "This is handled very simply: the reward you send in the feedback should correspond to the average reward of the batch.\n",
    "\n",
    "For Example:\n",
    "I have a batch of 10 requests. The algorithm routes them to branch 0 and returns 10 predictions from model A.\n",
    "I observe that 4 of these predictions were accurate and 6 were wrong.\n",
    "The reward I should give to this batch of predictions is then 0.4."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "request = {\n",
    "    \"data\": {\n",
    "        \"ndarray\":[\n",
    "            [0,0],\n",
    "            [0,0],\n",
    "            [0,0],\n",
    "            [0,0],\n",
    "            [0,0],\n",
    "            [0,0],\n",
    "            [0,0],\n",
    "            [0,0],\n",
    "            [0,0],\n",
    "            [0,0]\n",
    "        ],\n",
    "        \"names\": [\"feature_1\",\"feature_2\"]\n",
    "    }\n",
    "}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "response = rest_request(request)\n",
    "response"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "send_feedback_rest(request,response,reward=0.4)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "When the epsilon greedy algorithm receives this reward associated to the 10 predictions, it will deduce that out of the 10 predictions, $0.4*10=4$ were good and $0.6*10 = 6$ were bad."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Tear down"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!kubectl delete -f resources/epsilon_greedy.json -n mab"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!helm delete seldon-core --purge"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!helm delete seldon-core-crd --purge"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.6"
  },
  "varInspector": {
   "cols": {
    "lenName": 16,
    "lenType": 16,
    "lenVar": 40
   },
   "kernels_config": {
    "python": {
     "delete_cmd_postfix": "",
     "delete_cmd_prefix": "del ",
     "library": "var_list.py",
     "varRefreshCmd": "print(var_dic_list())"
    },
    "r": {
     "delete_cmd_postfix": ") ",
     "delete_cmd_prefix": "rm(",
     "library": "var_list.r",
     "varRefreshCmd": "cat(var_dic_list()) "
    }
   },
   "types_to_exclude": [
    "module",
    "function",
    "builtin_function_or_method",
    "instance",
    "_Feature"
   ],
   "window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
